    Sustainable growth in complex networks

    Based on an empirical analysis of the dependency networks in 18 Java projects, we develop a novel model of network growth which considers both an attachment mechanism and the addition of new nodes with a heterogeneous distribution of their initial degree, $k_0$. Empirically, we find that the cumulative degree distributions of initial degrees and of the final network follow power-law behaviors: $P(k_0) \propto k_0^{1-\alpha}$ and $P(k) \propto k^{1-\gamma}$, respectively. For the total number of links as a function of the network size, we find empirically $K(N) \propto N^{\beta}$, where $\beta$ lies between 1.25 and 2 at the beginning of the network evolution and converges to $\sim 1$ for large $N$. This indicates a transition from a growth regime with increasing network density towards a sustainable regime, which prevents a collapse under ever-increasing dependencies. Our theoretical framework is able to predict relations between the exponents $\alpha$, $\beta$, and $\gamma$, which also link issues of software engineering and developer activity. These relations are verified by means of computer simulations and empirical investigations. They indicate that the growth of real Open Source Software networks occurs on the edge between two regimes, dominated either by the initial degree distribution of added nodes or by the preferential attachment mechanism. Hence, the heterogeneous degree distribution of newly added nodes, found empirically, is essential to describe the laws of sustainable growth in networks. Comment: 5 pages, 2 figures, 1 table
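
    As a rough illustration of the growth mechanism described above, the sketch below grows a network by drawing each newcomer's initial degree k0 from a power law and attaching its links preferentially. It is a minimal toy model under our own parameter choices (alpha, k_min, k_max), not the paper's calibrated model.

```python
# Minimal sketch: growth with heterogeneous initial degree + preferential attachment.
import numpy as np

def sample_power_law(alpha, k_min, k_max, rng):
    """Draw an integer from P(k) ~ k^-alpha on [k_min, k_max] by normalized weights."""
    ks = np.arange(k_min, k_max + 1)
    p = ks.astype(float) ** (-alpha)
    p /= p.sum()
    return rng.choice(ks, p=p)

def grow_network(n_nodes, alpha=2.5, k_min=1, k_max=50, seed=0):
    rng = np.random.default_rng(seed)
    # Start from a small clique so preferential attachment has targets.
    edges = [(0, 1), (0, 2), (1, 2)]
    # Repeated-endpoint list: sampling uniformly from it is degree-proportional.
    endpoints = [u for e in edges for u in e]
    n_existing = 3
    for new in range(3, n_nodes):
        k0 = min(sample_power_law(alpha, k_min, k_max, rng), n_existing)
        targets = set()
        while len(targets) < k0:
            targets.add(endpoints[rng.integers(len(endpoints))])
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
        n_existing += 1
    return edges

edges = grow_network(2000)
print("total links K(N):", len(edges))
```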

    VoG: Summarizing and Understanding Large Graphs

    How can we succinctly describe a million-node graph with a few simple sentences? How can we measure the "importance" of a set of discovered subgraphs in a large graph? These are exactly the problems we focus on. Our main ideas are to construct a "vocabulary" of subgraph types that often occur in real graphs (e.g., stars, cliques, chains) and, from a set of subgraphs, find the most succinct description of a graph in terms of this vocabulary. We measure success in a well-founded way by means of the Minimum Description Length (MDL) principle: a subgraph is included in the summary if it decreases the total description length of the graph. Our contributions are three-fold: (a) formulation: we provide a principled encoding scheme to choose vocabulary subgraphs; (b) algorithm: we develop VoG, an efficient method to minimize the description cost; and (c) applicability: we report experimental results on multi-million-edge real graphs, including Flickr and the Notre Dame web graph.
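
    A toy sketch of the MDL decision at the heart of this approach: keep a candidate structure only if naming it (plus the bits for its deviations) is cheaper than listing its edges one by one. The bit costs below are simplified stand-ins, not VoG's actual encoding scheme.

```python
# Toy MDL test: is "clique + error bits" cheaper than raw edges?
import math

def log2_comb(n, k):
    """Bits to specify k items out of n: log2 of C(n, k)."""
    return math.log2(math.comb(n, k)) if 0 <= k <= n else float("inf")

def cost_as_raw_edges(n_total, edges_inside):
    # Encode each present edge as one pair chosen among all possible pairs.
    pairs = n_total * (n_total - 1) // 2
    return edges_inside * math.log2(pairs)

def cost_as_clique(n_total, n_set, edges_inside):
    pairs_inside = n_set * (n_set - 1) // 2
    missing = pairs_inside - edges_inside           # deviations from a full clique
    bits_nodes = log2_comb(n_total, n_set)          # which nodes form the clique
    bits_errors = log2_comb(pairs_inside, missing)  # which clique edges are absent
    return bits_nodes + bits_errors

# Example: 20 nodes in a 1000-node graph, 180 of the 190 possible edges present.
raw = cost_as_raw_edges(1000, 180)
clq = cost_as_clique(1000, 20, 180)
print(f"raw: {raw:.0f} bits, clique+errors: {clq:.0f} bits -> keep clique: {clq < raw}")
```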

    Kronecker Graphs: An Approach to Modeling Networks

    How can we model networks with a mathematically tractable model that allows for rigorous analysis of network properties? Networks exhibit a long list of surprising properties: heavy tails for the degree distribution; small diameters; and densification and shrinking diameters over time. Most present network models either fail to match several of the above properties, are complicated to analyze mathematically, or both. In this paper we propose a generative model for networks that is both mathematically tractable and can generate networks that have the above mentioned properties. Our main idea is to use the Kronecker product to generate graphs that we refer to as "Kronecker graphs". First, we prove that Kronecker graphs naturally obey common network properties. We also provide empirical evidence showing that Kronecker graphs can effectively model the structure of real networks. We then present KronFit, a fast and scalable algorithm for fitting the Kronecker graph generation model to large real networks. A naive approach to fitting would take super-exponential time. In contrast, KronFit takes linear time, by exploiting the structure of Kronecker matrix multiplication and by using statistical simulation techniques. Experiments on large real and synthetic networks show that KronFit finds accurate parameters that indeed very well mimic the properties of target networks. Once fitted, the model parameters can be used to gain insights about the network structure, and the resulting synthetic graphs can be used for null models, anonymization, extrapolations, and graph summarization.
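
    A minimal sketch of naive Kronecker graph generation: raise a small initiator probability matrix to its k-th Kronecker power and flip a coin per entry. The 2x2 initiator values below are illustrative, not KronFit-fitted parameters, and the dense-matrix approach is only practical for small k.

```python
# Stochastic Kronecker graph via repeated Kronecker powers of an initiator.
import numpy as np

initiator = np.array([[0.9, 0.5],
                      [0.5, 0.3]])     # entries are edge probabilities

def kronecker_graph(initiator, k, seed=0):
    rng = np.random.default_rng(seed)
    P = initiator.copy()
    for _ in range(k - 1):
        P = np.kron(P, initiator)      # probability matrix of size 2^k x 2^k
    return rng.random(P.shape) < P     # sample the adjacency matrix entry-wise

A = kronecker_graph(initiator, k=10)   # 1024-node graph
print("nodes:", A.shape[0], "edges:", int(A.sum()))
```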

    A dissemination strategy for immunizing scale-free networks

    We consider the problem of distributing a vaccine for immunizing a scale-free network against a given virus or worm. We introduce a new method, based on vaccine dissemination, that seems to reflect more accurately what is expected to occur in real-world networks. Also, since the dissemination is performed using only local information, the method can be easily employed in practice. Using a random-graph framework, we analyze our method both mathematically and by means of simulations. We demonstrate its efficacy regarding the trade-off between the expected number of nodes that receive the vaccine and the network's resulting vulnerability to developing an epidemic as the virus or worm attempts to infect one of its nodes. For some scenarios, the new method is seen to render the network practically invulnerable to attacks while requiring only a small fraction of the nodes to receive the vaccine.
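
    The sketch below illustrates the general flavor of local, dissemination-based immunization on a scale-free test graph: seeds forward the vaccine to neighbors with some probability, and the size of the largest susceptible component serves as a crude proxy for vulnerability. The protocol and parameters are our own simplifications, not the paper's exact method.

```python
# Local vaccine dissemination on a scale-free graph. Requires networkx.
import random
import networkx as nx

def disseminate(G, seeds, p, rng):
    """Each vaccinated node forwards the vaccine to a neighbor with probability p."""
    vaccinated = set(seeds)
    frontier = list(seeds)
    while frontier:
        node = frontier.pop()
        for nbr in G.neighbors(node):
            if nbr not in vaccinated and rng.random() < p:
                vaccinated.add(nbr)
                frontier.append(nbr)
    return vaccinated

rng = random.Random(0)
G = nx.barabasi_albert_graph(10_000, 3, seed=0)    # scale-free test network
seeds = rng.sample(list(G.nodes), 10)
vaccinated = disseminate(G, seeds, p=0.3, rng=rng)

H = G.subgraph(set(G.nodes) - vaccinated)          # residual susceptible network
largest = max((len(c) for c in nx.connected_components(H)), default=0)
print(f"vaccinated: {len(vaccinated)}, largest susceptible component: {largest}")
```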

    Power-hop: A pervasive observation for real complex networks

    Complex networks have been shown to exhibit universal properties, one of the most consistent being the scale-free degree distribution. But are there regularities obeyed by the r-hop neighborhood in real networks? We answer this question by identifying another power-law pattern that describes the relationship between the fraction of node pairs C(r) within r hops and the hop count r. This scale-free distribution is pervasive and describes a large variety of networks, ranging from social and urban to technological and biological networks. In particular, inspired by the definition of the fractal correlation dimension D2 on a point set, we consider the hop count r to be the underlying distance metric between two vertices of the network, and we examine the scaling of C(r) with r. We find that this relationship follows a power law in real networks within the range 2 < r < d, where d is the effective diameter of the network, that is, the 90th-percentile distance. We term this relationship the power-hop and the corresponding power-law exponent the power-hop exponent h. We provide theoretical justification for this pattern under successful existing network models, and we analyze a large set of real and synthetic network datasets, showing the pervasiveness of the power-hop.
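
    Measuring the power-hop pattern on a graph is straightforward; the hedged sketch below computes C(r) by BFS from every node, takes the 90th-percentile distance as the effective diameter, and fits the exponent h by least squares in log-log space (the fitting details are our choices, not the paper's).

```python
# Compute C(r) and fit the power-hop exponent h. Requires networkx and numpy.
import numpy as np
import networkx as nx

G = nx.barabasi_albert_graph(2000, 3, seed=0)
n = G.number_of_nodes()

# Count node pairs at each hop distance via BFS from every node.
dist_counts = {}
for src in G.nodes:
    for dst, d in nx.single_source_shortest_path_length(G, src).items():
        if src < dst:                        # count each pair once
            dist_counts[d] = dist_counts.get(d, 0) + 1

total_pairs = n * (n - 1) // 2
max_d = max(dist_counts)
C = np.cumsum([dist_counts.get(r, 0) for r in range(1, max_d + 1)]) / total_pairs

# Effective diameter: smallest r with C(r) >= 0.9.
d_eff = next(r for r in range(1, max_d + 1) if C[r - 1] >= 0.9)

rs = np.arange(2, d_eff + 1)
h, _ = np.polyfit(np.log(rs), np.log(C[rs - 1]), 1)   # slope in log-log space
print(f"effective diameter: {d_eff}, power-hop exponent h ~ {h:.2f}")
```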

    Generic Subsequence Matching Framework: Modularity, Flexibility, Efficiency

    Subsequence matching has proven to be an ideal approach for solving many problems related to the fields of data mining and similarity retrieval. It has been shown that almost any data class (audio, image, biometrics, signals) is or can be represented by some kind of time series or string of symbols, which can serve as input for various subsequence matching approaches. The variety of data types, specific tasks, and their partial or full solutions is so wide that the choice, implementation, and parametrization of a suitable solution for a given task can be complicated and time-consuming; a possibly fruitful combination of fragments from different research areas may be neither obvious nor easy to realize. The leading authors in this field also mention the implementation bias that makes a proper comparison of competing approaches difficult. We therefore present a new generic Subsequence Matching Framework (SMF) that tries to overcome the aforementioned problems with a uniform framework that simplifies and speeds up the design, development, and evaluation of subsequence-matching systems. We identify several relatively separate subtasks that are solved differently across the literature; SMF enables combining them in a straightforward manner, achieving new levels of quality and efficiency. The framework can be used in many application domains and its components can be reused effectively. Its strictly modular architecture and openness also enable the involvement of efficient solutions from different fields, for instance efficient metric-based indexes. This is an extended version of a paper published at DEXA 2012. Comment: This is an extended version of a paper published at DEXA 2012
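
    SMF itself is a framework with its own (Java-based) component model; the Python sketch below only illustrates the modularity idea it advocates: the subsequence extractor and the distance measure are independent, pluggable pieces, so the matcher never changes when either is swapped.

```python
# Modular subsequence matching: pluggable extractor + pluggable distance.
from typing import Callable, Iterator, Sequence, Tuple

def sliding_windows(series: Sequence[float], width: int, step: int = 1
                    ) -> Iterator[Tuple[int, Sequence[float]]]:
    """One pluggable subtask: cut the data series into candidate subsequences."""
    for start in range(0, len(series) - width + 1, step):
        yield start, series[start:start + width]

def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Another pluggable subtask: the similarity measure between subsequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def best_match(series, query, distance: Callable = euclidean, step: int = 1):
    """The matcher depends only on the two component interfaces above."""
    return min(sliding_windows(series, len(query), step),
               key=lambda item: distance(item[1], query))

series = [0.0, 0.2, 1.0, 2.1, 3.0, 2.0, 1.1, 0.1, 0.0]
query = [1.0, 2.0, 3.0]
pos, window = best_match(series, query)
print(f"best match at offset {pos}: {list(window)}")
```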

    Behavior of susceptible-infected-susceptible epidemics on heterogeneous networks with saturation

    We investigate saturation effects in susceptible-infected-susceptible (SIS) models of the spread of epidemics in heterogeneous populations. The structure of interactions in the population is represented by networks with connectivity distribution $P(k)$, including scale-free (SF) networks with power-law distributions $P(k) \sim k^{-\gamma}$. Considering cases where the transmission of infection between nodes depends on their connectivity, we introduce a saturation function $C(k)$ which reduces the infection transmission rate $\lambda$ across an edge going from a node with high connectivity $k$. A mean-field approximation that neglects degree-degree correlations then leads to a finite threshold $\lambda_c > 0$ for SF networks with $2 < \gamma \leq 3$. We also find, in this approximation, the fraction of infected individuals among those with degree $k$ for $\lambda$ close to $\lambda_c$. We investigate via computer simulation the contact process on a heterogeneous regular lattice and compare the results with those obtained from mean-field theory with and without the neglect of degree-degree correlations. Comment: 6 figures
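
    The sketch below is a crude discrete-time simulation of the saturated-SIS idea, not the paper's mean-field analysis: an infected node of degree k transmits along each edge with probability lam * C(k), where the saturation function C(k) = 1/(1 + k/k_sat) is one plausible choice of ours.

```python
# Discrete-time SIS with a degree-saturation function. Requires networkx.
import random
import networkx as nx

def C(k, k_sat=10):
    return 1.0 / (1.0 + k / k_sat)      # suppresses transmission from hubs

def sis_step(G, infected, lam, mu, rng):
    """One synchronous update: recovery with prob mu, transmission with lam*C(k)."""
    new_infected = set()
    for node in infected:
        if rng.random() >= mu:           # stays infected unless it recovers
            new_infected.add(node)
        rate = lam * C(G.degree(node))
        for nbr in G.neighbors(node):
            if nbr not in infected and rng.random() < rate:
                new_infected.add(nbr)
    return new_infected

rng = random.Random(0)
G = nx.barabasi_albert_graph(5000, 3, seed=0)
infected = set(rng.sample(list(G.nodes), 50))
for _ in range(200):
    infected = sis_step(G, infected, lam=0.4, mu=0.2, rng=rng)
print(f"endemic fraction: {len(infected) / G.number_of_nodes():.3f}")
```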

    From Cooperative Scans to Predictive Buffer Management

    In analytical applications, database systems often need to sustain workloads with multiple concurrent scans hitting the same table. The Cooperative Scans (CScans) framework, which introduces an Active Buffer Manager (ABM) component into the database architecture, has been the most effective and elaborate response to this problem, and was initially developed in the X100 research prototype. We now report on the experiences of integrating Cooperative Scans into its industrial-strength successor, the Vectorwise database product. During this implementation we invented a simpler optimization of concurrent scan buffer management, called Predictive Buffer Management (PBM). PBM is based on the observation that in a workload with long-running scans, the buffer manager has considerable information about the workload in the immediate future, such that an approximation of the ideal OPT algorithm becomes feasible. In an evaluation on both synthetic benchmarks and a TPC-H throughput run, we compare the benefits of naive buffer management (LRU) versus CScans, PBM, and OPT, showing that PBM achieves benefits close to Cooperative Scans while incurring much lower architectural impact. Comment: VLDB201
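
    The ideal OPT policy that PBM approximates is Belady's rule: with knowledge of future accesses, always evict the page whose next use lies farthest in the future. A minimal sketch of that rule follows, with a made-up access trace that is unrelated to Vectorwise internals.

```python
# Belady's OPT eviction: evict the buffered page reused farthest away.
def opt_hits(trace, capacity):
    buffer, hits = set(), 0
    for i, page in enumerate(trace):
        if page in buffer:
            hits += 1
            continue
        if len(buffer) >= capacity:
            def next_use(p):
                try:
                    return trace.index(p, i + 1)   # position of next access
                except ValueError:
                    return float("inf")            # never used again
            buffer.discard(max(buffer, key=next_use))
        buffer.add(page)
    return hits

trace = [1, 2, 3, 1, 2, 4, 1, 2, 3, 4, 1, 5, 2, 1]
print("OPT hits:", opt_hits(trace, capacity=3), "of", len(trace), "accesses")
```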

    The procedure for obtaining an entry visa for citizens of Bosnia and Herzegovina

    With the rise of online social networks and smartphones that record the user's location, a new type of online social network has gained popularity during the last few years: the so-called Location-Based Social Networks (LBSNs). In such networks, users voluntarily share their location with their friends via a check-in. In exchange, they get recommendations tailored to their particular location, as well as special deals that businesses offer when users check in frequently. LBSNs started as specialized platforms such as Gowalla and Foursquare, but their immense popularity has led online social networking giants like Facebook to adopt this functionality. The spatial aspect of LBSNs directly ties the physical with the online world, creating a very rich ecosystem where users interact with their friends online as well as declare their physical (co-)presence in various locations. Such a rich environment calls for novel analytic tools that can model the aforementioned types of interactions. In this work, we propose to model and analyze LBSNs using tensors and tensor decompositions, powerful analytical tools that have enjoyed great growth and success in fields like Machine Learning, Data Mining, and Signal Processing alike. By doing so, we identify tightly knit, hidden communities of users and the locations they frequent. In addition to tensor decompositions, we use Signal Processing tools previously employed in Direction of Arrival (DOA) estimation to study the temporal dynamics of hidden communities in LBSNs.
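
    As a hedged sketch of the general pipeline (synthetic data, our own rank and mode choices, not the paper's datasets): build a user x location x time tensor from check-ins and factor it with a CP/PARAFAC decomposition, whose components couple a user group with the places and times they share.

```python
# Check-in tensor + CP decomposition. Requires numpy and tensorly.
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

n_users, n_places, n_slots = 50, 30, 24
rng = np.random.default_rng(0)

# Synthetic check-ins: (user, place, hour-of-day) triples.
checkins = zip(rng.integers(0, n_users, 500),
               rng.integers(0, n_places, 500),
               rng.integers(0, n_slots, 500))
X = np.zeros((n_users, n_places, n_slots))
for u, p, t in checkins:
    X[u, p, t] += 1.0

cp = parafac(tl.tensor(X), rank=3)          # CPTensor with .weights, .factors
users, places, slots = cp.factors           # one factor matrix per mode
top_users = np.argsort(users[:, 0])[-5:]    # strongest members of component 0
print("component 0, top users:", top_users)
```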

    Search in Complex Networks: A New Method of Naming

    We suggest a method for routing when the source does not possess full information about the shortest path to the destination. The method is particularly useful for scale-free networks and exploits their unique characteristics. By assigning new (short) names to nodes (aka labelling), we are able to significantly reduce the memory requirements at the routers, yet we succeed in routing with high probability through paths very close in distance to the shortest ones. Comment: 5 pages, 4 figures
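
    The paper's exact labelling scheme is not reproduced here; the sketch below illustrates the same idea with a simple landmark-based variant of our own: name each node by its hop distances to a few high-degree hubs, then route greedily using distance estimates computed from the short labels alone.

```python
# Landmark labels + greedy routing on a scale-free graph. Requires networkx.
import networkx as nx

G = nx.barabasi_albert_graph(3000, 3, seed=0)

# Short names: each node's hop distances to the top-3 highest-degree hubs.
hubs = sorted(G.nodes, key=G.degree, reverse=True)[:3]
dist = {h: nx.single_source_shortest_path_length(G, h) for h in hubs}
label = {v: tuple(dist[h][v] for h in hubs) for v in G.nodes}

def estimate(u, v):
    # Upper bound on d(u, v) via each hub, using only the labels.
    return min(lu + lv for lu, lv in zip(label[u], label[v]))

def greedy_route(src, dst, max_steps=100):
    path = [src]
    while path[-1] != dst and len(path) < max_steps:
        nbrs = list(G.neighbors(path[-1]))
        path.append(dst if dst in nbrs
                    else min(nbrs, key=lambda n: estimate(n, dst)))
    return path

path = greedy_route(0, 1500)
print("reached:", path[-1] == 1500,
      "| hops:", len(path) - 1,
      "| shortest:", nx.shortest_path_length(G, 0, 1500))
```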